Up until now, to visualize data in my own research, I would obtain means and standard errors in SPSS, then plot the data in Excel. This practice isn’t exactly what I’d call ideal. So, the purpose of this post is to demonstrate various methods of visualizing uncertainty in R, with a particular emphasis on visualizing pupillometry data.
The data-set I will be using for this demonstration is de-identified and already hosted on Github. The data come from one of my recent publications (Experiment 1; Miller, Gross, & Unsworth, 2019). In this paper, pupil dilation was used as an index of the intensity of attention to determine if variation in attention at encoding partially accounts for the relation between working memory capacity (WMC) and long-term memory performance (LTM). Participants (N = 138) completed a battery of complex span working memory tasks, followed by a delayed free recall task while pupil dilation was simultaneously recorded.
Before beginning, let’s first load and inspect the data:
data <- import(here("data", "DeIntentifiedJML2019Data_Exp1.sav"),
setclass = "tibble") %>%
characterize() %>%
janitor::clean_names()
head(data)
## # A tibble: 6 x 183
## subject sp1_acc sp2_acc sp3_acc sp4_acc sp5_acc sp6_acc sp7_acc sp8_acc
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 0.8 1 1 0.8 1 0.2 0.4
## 2 2 0.6 0.4 1 0.8 0.8 0.6 0.8 0.6
## 3 3 1 1 0.4 0.6 0 0.8 0.4 0.4
## 4 4 1 1 0.8 0.8 0.4 0.6 0.2 0.6
## 5 5 0.8 1 0.6 0.4 0.6 0.4 0.4 0.2
## 6 6 1 0.8 0.6 0.6 0.2 0.2 0.4 0
## # … with 174 more variables: sp9_acc <dbl>, sp10_acc <dbl>,
## # passive_read_strat <dbl>, rehearsal_strat <dbl>,
## # sentence_gen_strat <dbl>, imagery_strat <dbl>, grouping_strat <dbl>,
## # other_strat <dbl>, accuracy_mean <dbl>, ospan <dbl>, rspan <dbl>,
## # symspan <dbl>, wmc <dbl>, span_group <chr>, ineffective <dbl>,
## # effective <dbl>, primacy_recall <dbl>, middle_recall <dbl>,
## # recency_recall <dbl>, baseline_pupil_mean <dbl>, tepr_mean <dbl>,
## # tepr_primacy <dbl>, tepr_mid <dbl>, tepr_recency <dbl>,
## # ebin1w1pt4_mean <dbl>, ebin2w1pt4_mean <dbl>, ebin3w1pt4_mean <dbl>,
## # ebin4w1pt4_mean <dbl>, ebin5w1pt4_mean <dbl>, ebin6w1pt4_mean <dbl>,
## # ebin7w1pt4_mean <dbl>, ebin8w1pt4_mean <dbl>, ebin9w1pt4_mean <dbl>,
## # ebin10w1pt4_mean <dbl>, ebin11w1pt4_mean <dbl>,
## # ebin12w1pt4_mean <dbl>, ebin13w1pt4_mean <dbl>,
## # ebin14w1pt4_mean <dbl>, ebin15w1pt4_mean <dbl>, ebin1w2pt4_mean <dbl>,
## # ebin2w2pt4_mean <dbl>, ebin3w2pt4_mean <dbl>, ebin4w2pt4_mean <dbl>,
## # ebin5w2pt4_mean <dbl>, ebin6w2pt4_mean <dbl>, ebin7w2pt4_mean <dbl>,
## # ebin8w2pt4_mean <dbl>, ebin9w2pt4_mean <dbl>, ebin10w2pt4_mean <dbl>,
## # ebin11w2pt4_mean <dbl>, ebin12w2pt4_mean <dbl>,
## # ebin13w2pt4_mean <dbl>, ebin14w2pt4_mean <dbl>,
## # ebin15w2pt4_mean <dbl>, ebin1w3pt4_mean <dbl>, ebin2w3pt4_mean <dbl>,
## # ebin3w3pt4_mean <dbl>, ebin4w3pt4_mean <dbl>, ebin5w3pt4_mean <dbl>,
## # ebin6w3pt4_mean <dbl>, ebin7w3pt4_mean <dbl>, ebin8w3pt4_mean <dbl>,
## # ebin9w3pt4_mean <dbl>, ebin10w3pt4_mean <dbl>, ebin11w3pt4_mean <dbl>,
## # ebin12w3pt4_mean <dbl>, ebin13w3pt4_mean <dbl>,
## # ebin14w3pt4_mean <dbl>, ebin15w3pt4_mean <dbl>, ebin1w4pt4_mean <dbl>,
## # ebin2w4pt4_mean <dbl>, ebin3w4pt4_mean <dbl>, ebin4w4pt4_mean <dbl>,
## # ebin5w4pt4_mean <dbl>, ebin6w4pt4_mean <dbl>, ebin7w4pt4_mean <dbl>,
## # ebin8w4pt4_mean <dbl>, ebin9w4pt4_mean <dbl>, ebin10w4pt4_mean <dbl>,
## # ebin11w4pt4_mean <dbl>, ebin12w4pt4_mean <dbl>,
## # ebin13w4pt4_mean <dbl>, ebin14w4pt4_mean <dbl>,
## # ebin15w4pt4_mean <dbl>, ebin1w5pt4_mean <dbl>, ebin2w5pt4_mean <dbl>,
## # ebin3w5pt4_mean <dbl>, ebin4w5pt4_mean <dbl>, ebin5w5pt4_mean <dbl>,
## # ebin6w5pt4_mean <dbl>, ebin7w5pt4_mean <dbl>, ebin8w5pt4_mean <dbl>,
## # ebin9w5pt4_mean <dbl>, ebin10w5pt4_mean <dbl>, ebin11w5pt4_mean <dbl>,
## # ebin12w5pt4_mean <dbl>, ebin13w5pt4_mean <dbl>,
## # ebin14w5pt4_mean <dbl>, ebin15w5pt4_mean <dbl>, ebin1w6pt4_mean <dbl>,
## # …
There are many variables in this data-set that I won’t need for the purpose of this demonstration. Some of the variable names are also a bit confusing. So, I need to select the key variables of interest and rename the remaining variables:
# Select variables of interest for pupil data
pupil_data <- data %>%
select(-sp1_acc:-symspan,
-ineffective:-recency_recall,
-baseline_pupil_mean:-tepr_recency)
# Cam's efficient method to rename bin/word variables
# E.g., data currently reads ebin1w1pt4_mean
# I just want the variable to read bin1w1
pupil_data %<>%
rename_at(
vars(starts_with("ebin")),
funs(
paste(
str_extract(., "w\\d{1,2}"),
"_",
str_extract(., "bin\\d{1,2}"),
sep = "")))
The first plot shows changes in pupil diameter across the 3 second study/encoding phase for each word (i.e., bin). As a starting point, see Fig. 4 (made in Excel) in Miller et al. (2019) below:
While this plot is perfectly fine for publication purposes, I intend to enhance this plot in a number of ways. But, namely, I want to map uncertainty (previously shown with the standard error bars) via bootstrapping/generation of outcome draws from a fitted model. For more details on this method, I highly recommend reading up on Claus O. Wilke’s ungeviz package. Another good reference can be located here.
The intended audience for this graph is the scientific community, but it is also my hope that interested laypersons may also be able to understand it. The message to be communicated is that when instructed to study a list of words for a later test, pupil diameter increases throughout the study phase for each word. This increase in pupil dilation is believed to reflect an increase in the amount of attentional effort devoted to a given item.
Data prep and code for plot:
# Tidy data:
plot1_data <- pupil_data %>%
gather(key = word_bin, value = TEPR, w1_bin1:w10_bin15) %>%
separate(word_bin, c("word", "bin"), sep = "_") %>%
mutate(word = parse_number(word),
bin = parse_number(bin)) %>%
arrange(subject)
#glimpse(plot1_data)
plot1 <- plot1_data %>%
group_by(bin) %>%
summarise(mean_TEPR = mean(TEPR, na.rm = TRUE),
se_TEPR = sundry::se(TEPR))
# This just makes the line begin at x = 0 and y = 0
plot1 <- rbind(plot1, "1st" = c(0, 0, 0))
# I want to model uncertainty with bootstrapping:
row_samps <- rerun(100,
sample(seq_len(nrow(plot1)),
nrow(plot1),
replace = TRUE))
# Extracting samples
d_samps <- map_df(row_samps, ~plot1[., ], .id = "sample")
# Plotting both data sources (my data and hypothetical/bootstrapped data)
plot1_academic <- ggplot(plot1, aes(x = bin, y = mean_TEPR)) +
stat_smooth(aes(group = sample),
data = d_samps,
geom = "line",
color = "gray60",
fullrange = TRUE,
size = 0.1) +
geom_smooth(color = "black", se = FALSE, size = 0.9) +
theme_bw(base_size = 12) +
# Change labels
labs(caption = "Figure 1. Task evoked pupillary response across the 3 second study (encoding) phase for each word",
y = "Change in Pupil Diameter (mm)",
x = "Time (ms)") +
# Relabel x axis values to make more sense to reader
scale_x_continuous(breaks = c(0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15),
labels = c("0", "", "400", "", "800", "",
"1,200", "", "1,600", "", "2,000",
"", "2,400", "", "2,800", ""),
limits = c(0, NA)) +
# Add text specifying when stimuli appear onscreen
geom_text(y = 0.04,
x = 1.4,
color = "black",
label = " To-be-remembered
word appears onscreen",
size = 4,
fontface = 2,
family = "Times") +
# Add dotted vertical line at x = 0
geom_vline(aes(xintercept = 0),
color = "gray50",
lty = "dashed")
academic_theme <- theme(plot.subtitle = element_text(family = "Times"),
plot.caption = element_text(family = "Times",
size = 12,
hjust = 0),
axis.title = element_text(family = "Times",
face = "bold"),
axis.text = element_text(family = "Times"),
axis.text.x = element_text(family = "Times"),
axis.text.y = element_text(family = "Times"),
plot.title = element_text(family = "Times",
face = "bold"),
legend.text = element_text(family = "Times"),
legend.title = element_text(family = "Times"),
strip.text = element_text(family = "Times"),
panel.grid.major = element_line(colour = "white"),
panel.grid.minor = element_line(colour = "white"))
# Apply an academic theme to plot
plot1_academic + academic_theme
Here I apply my own theme:
# Plotting both data sources (my data and hypothetical/bootstrapped data)
plot1_fun <- ggplot(plot1, aes(x = bin, y = mean_TEPR)) +
stat_smooth(aes(group = sample),
data = d_samps,
geom = "line",
color = "#2DDADA",
fullrange = TRUE,
size = 0.1) +
geom_smooth(color = "magenta", se = FALSE, size = 0.9) +
theme_minimal(base_size = 12) +
# Change labels
labs(caption = "Figure 1. Task evoked pupillary response across the 3 second study (encoding) phase for each word",
y = "Change in Pupil Diameter (mm)",
x = "Time (ms)") +
# Relabel x axis values to make more sense to reader
scale_x_continuous(breaks = c(0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15),
labels = c("0", "", "400", "", "800", "",
"1,200", "", "1,600", "", "2,000",
"", "2,400", "", "2,800", ""),
limits = c(0, NA)) +
# Add text specifying when stimuli appear onscreen
geom_text(y = 0.04,
x = 1.6,
color = "gray80",
label = " To-be-remembered
word appears onscreen", size = 4, fontface = 2) +
# Add dotted vertical line at x = 0
geom_vline(aes(xintercept = 0),
color = "gray80",
lty = "dashed")
bbg_darktheme <- theme(panel.grid.major = element_line(colour = "gray10"),
panel.grid.minor = element_line(colour = "gray10"),
axis.text = element_text(colour = "gray80"),
axis.text.x = element_text(colour = "gray80"),
axis.text.y = element_text(colour = "gray80"),
axis.title = element_text(colour = "gray80",
face = "bold"),
axis.ticks = element_line(colour = "gray80"),
axis.line = element_line(colour = "gray80"),
legend.text = element_text(colour = "gray80"),
legend.title = element_text(colour = "gray80"),
plot.subtitle = element_text(colour = "gray80"),
strip.text = element_text(colour = "gray80",
face = "bold"),
panel.background = element_rect(fill = "gray10"),
plot.background = element_rect(fill = "gray10"),
legend.background = element_rect(fill = NA,
color = NA),
plot.margin = margin(10, 10, b = 20, 10),
plot.caption = element_text(colour = "gray80",
vjust = 1,
hjust = 0,
size = 10.5),
plot.title = element_text(colour = "gray80",
face = "bold"))
plot1_fun + bbg_darktheme
The second plot shows changes in pupil diameter during study as a function of serial position and one’s WMC. See Fig. 5 in Miller et al. 2019 below:
Again, the intended audience for this graph is the scientific community. The message to be communicated is that different patterns of pupil dilation across serial positions emerge based on one’s working memory capacity (WMC). Namely, for high WMC individuals (n = 33), pupil diameter increases as each new word is introduced during the learning phase of the task. Alternatively, for low WMC individuals (n = 31), pupil diameter decreases as each new word is introduced during learning.
Here I re-create this plot but map uncertainty via geom_ribbon():
#Reordering factor levels for legend
plot1_data$span_group <- factor(plot1_data$span_group,
levels = c("Low", "Medium", "High"))
plot2data <- plot1_data %>%
group_by(word, span_group) %>%
summarise(mean_TEPR = mean(TEPR, na.rm = TRUE),
se_TEPR = sundry::se(TEPR))
# Rename variable for legend
plot2data %<>%
rename("WMC" = `span_group`)
# Rename variables to enhance clarity
plot2data %<>%
mutate(WMC = recode(WMC,
'Low' = "Low WMC",
'Medium' = "Medium WMC",
'High' = "High WMC"))
# Plot
plot2_academic <- ggplot(plot2data, aes(x = word, y = mean_TEPR)) +
geom_ribbon(aes(ymin = mean_TEPR - se_TEPR,
ymax = mean_TEPR + se_TEPR,
fill = WMC),
alpha = 0.8) +
scale_fill_grey() +
# Add reference line for means of each group
geom_line(aes(colour = WMC), size = 1) +
scale_colour_grey() +
# Add labels for each WMC group
geom_text(data = filter(plot2data, word == '10'),
aes(y = mean_TEPR, label = WMC),
color = "black",
nudge_x = .20,
hjust = 0,
size = 4,
family = "Times",
fontface = 2) +
theme_bw(base_size = 12) +
# Left-align caption
theme(legend.position = "none") +
# Change labels
labs(caption = "Figure 2. Pupillary response across serial positions for low working memory capacity
(WMC) individuals (n = 31), medium WMC individuals (n = 69), and high WMC
individuals (n = 33).",
y = "Change in Pupil Diameter (mm)",
x = "Serial position",
colour = "") +
# Relabel x axis values to make more sense to reader
scale_x_continuous(breaks = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
labels = c("1", "2", "3", "4", "5",
"6", "7", "8", "9", "10"),
limits = c(1, 12))
plot2_academic + academic_theme
Now I add some color and apply my own personal theme:
#Specify colors for my own color palette
outrun <- c("violetred4", "pink", "turquoise4")
#Plot
plot2_fun <- ggplot(plot2data, aes(x = word, y = mean_TEPR)) +
geom_ribbon(aes(ymin = mean_TEPR - se_TEPR,
ymax = mean_TEPR + se_TEPR,
fill = WMC),
alpha = 0.4) +
scale_fill_manual(values = outrun) +
geom_line(aes(colour = WMC), size = 1) +
scale_colour_manual(values = outrun) +
# Add labels for each WMC group
geom_text(data = filter(plot2data, word == '10'),
aes(y = mean_TEPR, label = WMC),
color = "gray80",
nudge_x = .20,
hjust = 0,
size = 4,
fontface = 2) +
theme_minimal(base_size = 12) +
# Left-align caption
theme(legend.position = "none") +
# Change labels
labs(caption = "Figure 2. Pupillary response across serial positions for low working memory capacity
(WMC) individuals (n = 31), medium WMC individuals (n = 69), and high WMC
individuals (n = 33).",
y = "Change in Pupil Diameter (mm)",
x = "Serial position",
colour = "") +
# Relabel x axis values to make more sense to reader
scale_x_continuous(breaks = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
labels = c("1", "2", "3", "4", "5",
"6", "7", "8", "9", "10"),
limits = c(1, 12))
plot2_fun + bbg_darktheme
The third plot shows changes in pupil diameter as a function of serial position (broken down into Primacy (words 1–3), Mid (words 4–7), and Recency (words 8–10)) and bin (time across the 3 second study phase for each word) based on one’s WMC. See Fig. 6 in Miller et al. 2019 below (Note that error bars aren’t even provided!):
The intended audience is yet again the scientific community, and the plot essentially conveys the same information as the plot above (plot #2). That is, when instructed to study a list of words for a later test, two factors seem to influence how attention is allocated across the study phase for each word (as indexed via pupil dilation): (1) one’s WMC and (2) serial position. For high WMC individuals, pupil dilation continues to gradually increase throughout the encoding period for all serial positions, with primacy items (the first few words presented during list presentation; words 1-3) displaying smaller dilations than mid (words 4–7) and recency items (the last few words presented during list presentation; words 8–10). Conversely, low WMC individuals show moderate increases in dilation that appear to plateau near the middle of the encoding period. Moreover, pupil dilation appears to be largest for primacy items and smallest for recency items, despite a gradual increase in dilation for recency items.
Here I re-create this plot but map uncertainty/standard errors via geom_smooth():
# Converting word # to factors
plot1_data$word <- as.factor(plot1_data$word)
# Breaking down words into primacy, mid, and recency items
plot3_data <- plot1_data %>%
mutate(word = recode(word, '1' = "Primacy Items",
'2' = "Primacy Items",
'3' = "Primacy Items",
'4' = "Mid Items",
'5' = "Mid Items",
'6' = "Mid Items",
'7' = "Mid Items",
'8' = "Recency Items",
'9' = "Recency Items",
'10' = "Recency Items"))
plot3_data %<>%
group_by(word, bin, span_group) %>%
summarise(mean_TEPR = mean(TEPR, na.rm = TRUE),
se_TEPR = sundry::se(TEPR))
# Specify factor levels
plot3_data$word <- factor(plot3_data$word,
levels = c("Primacy Items",
"Mid Items",
"Recency Items"))
# Rename variables to enhance clarity
plot3_data %<>%
mutate(span_group = recode(span_group,
'Low' = "Low WMC",
'Medium' = "Medium WMC",
'High' = "High WMC"))
plot3_academic <- ggplot(plot3_data, aes(x = bin, y = mean_TEPR, colour = word)) +
geom_smooth(aes(linetype = word)) +
scale_linetype_manual(values=c("dotted", "twodash", "solid"),
name ="Serial Position",
breaks=c("Primacy Items", "Mid Items", "Recency Items"),
labels=c("Primacy", "Mid", "Recency")) +
scale_colour_manual(values = c("gray0", "gray50", "gray75"),
name ="Serial Position",
breaks=c("Primacy Items", "Mid Items", "Recency Items"),
labels=c("Primacy", "Mid", "Recency")) +
facet_wrap(~span_group) +
theme_bw(base_size = 12) +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
plot.subtitle = element_text(face = "bold", hjust = 0.5),
strip.text = element_text(size = 11),
legend.position = "bottom") +
# Change labels
labs(caption = "Figure 3. Pupil diameter as a function of serial position and time across encoding period for low WMC (n = 31), medium WMC
(n = 69), and high WMC (n = 33) individuals. Serial position was broken down into Primacy (words
1–3), Mid (words 4–7), and Recency (words 8–10) for graphical purposes only.",
y = "Changes in Pupil Diameter (mm)",
x = "Time (ms)") +
# Adjust/Relabel x axis values to make more sense to reader
scale_x_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14),
labels = c("0", "400", "800", "1,200",
"1,600", "2,000", "2,400","2,800"),
limits = c(0, NA)) +
# Adjust/Relabel y axis values to make more sense to reader
scale_y_continuous(breaks = c(-0.10, -0.05, 0.00, 0.05, 0.10, 0.15),
label = c("-0.10", "-0.05", "0.00",
"0.05", "0.10", "0.15"),
limits = c(-0.10, 0.15))
plot3_academic + academic_theme
This time, not only am I going to apply my own theme, but I will map uncertainty with a hypothetical outcome plot (HOP). Basically, like Plot 1, uncertainty is mapped via generation of outcome draws from a fitted model. HOPS just animate this process so that the audience can observe changes across hypothetical samples.
plot3_fun <- ggplot(plot3_data, aes(x = bin, y = mean_TEPR, colour = word)) +
# Generate outcome draws from a fitted model
stat_smooth_draws(times = 10,
aes(group = interaction(stat(.draw), colour)),
size = 0.8) +
scale_color_manual(values = outrun) +
#scale_color_carto_d(palette = "Burg") +
# Create seperate plots for people with low, mid, and high WMC
facet_wrap(~span_group) +
# Specifying sampled draws in addition to the animated lines
transition_states(stat(.draw), 1, 2) +
enter_fade() + exit_fade() +
shadow_mark(future = TRUE, size = 0.25, color = "gray50", alpha = 1/4) +
theme_minimal() +
# Rotate x axis values so they are angled; center subtitle
# Left-align caption
theme(axis.text.x = element_text(angle = 45, hjust = 1),
plot.subtitle = element_text(face = "bold", hjust = 0.5),
strip.text = element_text(size = 11)) +
# Change labels
labs(caption = "Figure 3. Pupil diameter as a function of serial position and time across encoding period for low WMC
(n = 31), medium WMC (n = 69), and high WMC (n = 33) individuals. Serial position was broken down
into Primacy (words 1–3), Mid (words 4–7), and Recency (words 8–10) for graphical purposes only.",
y = "Mean Pupil Diameter (mm)",
x = "Time (ms)",
colour = "Serial Position") +
# Adjust/Relabel x axis values to make more sense to reader
scale_x_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14),
labels = c("0", "400", "800", "1,200",
"1,600", "2,000", "2,400","2,800"),
limits = c(0, NA)) +
# Adjust/Relabel y axis values to make more sense to reader
scale_y_continuous(breaks = c(-0.10, -0.05, 0.00, 0.05, 0.10, 0.15),
label = c("-0.10", "-0.05", "0.00",
"0.05", "0.10", "0.15"),
limits = c(-0.10, 0.15))
plot3_fun + bbg_darktheme
This final plot is meant to be more digestible for the general public; it doesn’t examine pupillometry data. Rather, the plot shows the relation between recall accuracy and strategy type. The intended message to be communicated is that strategies that rely on more elaborative mental processes are associated with better memory performance.
Here I prep the data and create a standard, but fancy-ish, bar-plot:
# Select variables of interest
strategy_data <- data %>%
select(-sp1_acc:-sp10_acc,
-ospan:-span_group,
-primacy_recall:-ebin15w10pt4_mean) %>%
gather(key = strat_type, value = response, passive_read_strat:other_strat)
#levels(as.factor(strategy_data$strat_type))
# Filter out NAs, convert strategy type and score to factors, and recode strat
strategy_data %<>%
filter(response != "NA") %>%
mutate(strat_type = as.factor(strat_type),
response = as.factor(response),
strat_type = recode(strat_type,
"grouping_strat" = "Grouping",
"imagery_strat" = "Imagery",
"other_strat" = "Other",
"passive_read_strat" = "Passive Reading",
"rehearsal_strat" = "Rote Rehearsal",
"sentence_gen_strat" = "Sentence Generation"))
# Obtain summary stats
strategy_data %<>%
group_by(strat_type, response) %>%
summarise(mean_acc = mean(accuracy_mean, na.rm = TRUE),
se_acc = sundry::se(accuracy_mean))
# Specify factor levels
# Effective strategies = imagery, sentence generation, and grouping
# Ineffective strategies = rehearsal and passive reading
strategy_data$strat_type <- factor(strategy_data$strat_type,
levels = c("Imagery",
"Sentence Generation",
"Grouping",
"Rote Rehearsal",
"Passive Reading",
"Other"))
plot4 <- strategy_data %>%
#filter out other and grouping strategy
filter(strat_type != "Other" & strat_type != "Grouping") %>%
ggplot(aes(response, mean_acc, fill = response)) +
geom_col(width = 0.60,
alpha = 0.6) +
geom_errorbar(aes(ymin = mean_acc + qnorm(0.025)*se_acc,
ymax = mean_acc + qnorm(0.975)*se_acc),
color = "gray40",
width = 0.4,
size = 0.8) +
# Print accuracy associated with each condition
geom_text(aes(response, mean_acc, label = paste0(round(mean_acc*100), "%")),
nudge_y = 0.15, # Specifies how high above bar text appears
size = 3,
color = "gray80") +
# Provide seperate graphs for each strategy
facet_wrap(~strat_type) +
scale_fill_carto_d(palette = "Burg") +
theme_minimal() +
# Delete legend, left-align caption, center title
theme(legend.position="none",
plot.caption = element_text(hjust = 0),
strip.text = element_text(size = 11),
plot.title = element_text(hjust = 0.5)) +
#Modify labels
labs(title = "Recall Accuracy as a Function of Strategy Type and Use of Strategy",
y = "Mean Recall Accuracy",
x = "",
caption = "'No' represents individuals who did not report using given strategy, whereas 'Yes' represents those who used
said strategy. Note that performance is actually better among people who report not using a normatively
ineffective strategy (e.g., rote rehearsal or passive reading) than when people report using these strategies.") +
# Rename labels on x-axis
scale_x_discrete(breaks = c(0, 1),
labels = c("No", "Yes")) +
# Modify names/range of y-axis
scale_y_continuous(breaks = c(0, 0.2, 0.4, 0.6, 0.8),
labels = c("0%", "20%", "40%", "60%", "80%"),
limits = c(0, 0.8))
# Applying my theme to plot
plot4 + bbg_darktheme
One problem with the above plot, however, is that a general audience may not understand what the error bars represent. So, instead, I’m going to map uncertainty with density stripes. Again, I recommend checking out the ungeviz package for more details.
stripe_colors <- c("#7F38A7", "#42B9BD")
bar_colors <- c("#5F2A7C", "#31898C")
plot4v2 <- strategy_data %>%
#filter out other and grouping strategy
filter(strat_type != "Other" & strat_type != "Grouping") %>%
ggplot(aes(mean_acc, response, group = response, fill = response)) +
# Add density stripes to mark region of standard error
stat_confidence_density(aes(moe = se_acc),
height = 0.5) +
scale_fill_manual(values = stripe_colors) +
# Add vertical line where mean is for each condition
geom_vpline(aes(colour = response),
stat = "summary",
size = 0.9) +
scale_colour_manual(values = bar_colors) +
# Create seperate sections based on strategy type
facet_wrap(~strat_type, nrow = 1) +
# Print accuracy associated with each condition
geom_text(aes(mean_acc, response, label = paste0(round(mean_acc*100), "%")),
nudge_x = 0.005,
nudge_y = 0.3,
size = 3,
color = "gray80") +
theme_minimal() +
# Delete legend, increase size of facet labels, and center title
theme(legend.position = "none",
strip.text = element_text(size = 11),
plot.title = element_text(hjust = 0.5)) +
labs(title = "Recall Accuracy as a Function of Strategy Type and Use of Strategy",
y = "",
x = "Mean Recall Accuracy",
caption = "'No' represents individuals who did not report using given strategy, whereas 'Yes' represents those who used
said strategy. Note that performance is actually better among people who report not using a normatively
ineffective strategy (e.g., rote rehearsal or passive reading) than when people report using these strategies.") +
# Rename labels on x-axis
scale_y_discrete(breaks = c(0, 1),
labels = c("No", "Yes")) +
# Modify names/range of y-axis
scale_x_continuous(breaks = c(0.45, 0.50, 0.55, 0.6, 0.65, 0.70),
labels = c("0.45%", "", "55%" , "", "65%", ""),
limits = c(0.40, 0.70))
plot4v2 + bbg_darktheme
Personally, I think this looks pretty cool.
To review, rather than using traditional error bars to map uncertainty, R provides us with many alternatives, such as bootstrapping/generation of outcome draws from a fitted model, geom_ribbon(), geom_smooth(), HOPS, and density stripes. It is my hope to learn how to also create an interactive plot.. stay tuned.
Thanks for the read! :)